Abstract. We present a new approach for performing predicate abstraction based on 
symbolic decision procedures. Intuitively, a symbolic decision procedure for a theory takes 
a set of predicates in the theory and symbolically executes a decision procedure on all 
the subsets over the set of predicates. The result of the symbolic decision procedure is a 
shared expression (represented by a directed acyclic graph) that implicitly represents the 
answer to a predicate abstraction query. 

We present symbolic decision procedures for the logic of Equality and Uninterpreted 
Functions (EUF) and Difference logic (DIF) and show that these procedures run in 
pseudo-polynomial (rather than exponential) time. We then provide a method to construct 
symbolic decision procedures for simple mixed theories (including the two theories 
mentioned above) using an extension of the Nelson-Oppen combination method. We present 
a preliminary evaluation of our procedure on predicate abstraction benchmarks from device 
driver verification in SLAM. 



1. Introduction 

Predicate abstraction is a technique for automatically creating finite abstract models 
of finite and infinite state systems [GS97]. The method has been widely used in abstracting 
finite-state models of programs in SLAM [BMMR01] and numerous other software verification 
projects [HJMS02, CCG+04]. It has also been used for synthesizing loop invariants [FQ02] 
and verifying distributed protocols [DDP99, LBC03]. 

The fundamental operation in predicate abstraction can be summarized as follows: 
Given a set of predicates $P$ describing some set of properties of the system state, and a 
formula $e$, compute the weakest Boolean formula $\mathcal{F}_P(e)$ over the predicates $P$ that implies 
$e$. 

Most implementations of predicate abstraction [GS97, BMMR01] construct $\mathcal{F}_P(e)$ by 
collecting the set of cubes (a conjunction of the predicates or their negations) over $P$ that 
imply $e$. The implication is checked using a first-order theorem prover. This method may 
require a very large number of calls to the theorem prover ($2^{|P|}$ in the worst case) 
and can be expensive. 

We propose a new way to perform predicate abstraction based on symbolic decision 
procedures. A symbolic decision procedure for a theory $T$ ($SDP_T$) takes sets of predicates 
$G$ and $E$ and symbolically executes a decision procedure for $T$ on $G' \cup \{\neg e \mid e \in E\}$, for 
all the subsets $G'$ of $G$. The output of $SDP_T(G, E)$ is a shared expression (an expression 
where common subexpressions can be shared) representing those subsets $G' \subseteq G$ for which 
$G' \cup \{\neg e \mid e \in E\}$ is unsatisfiable. We show that such a procedure can be used to compute 
$\mathcal{F}_P(e)$ for performing predicate abstraction. 

We present symbolic decision procedures for the logic of Equality and Uninterpreted 
Functions (EUF) and Difference logic (DIF) and show that these procedures run in 
polynomial and pseudo-polynomial time respectively, and therefore produce compact shared 
expressions. We provide a method to construct an SDP for a combination of two simple 
theories $T_1 \cup T_2$ (including EUF + DIF), by using an extension of the Nelson-Oppen 
combination [NO80] method. We use Binary Decision Diagrams (BDDs) [Bry86] to construct 
$\mathcal{F}_P(e)$ from the shared representations efficiently in practice. 

We present a preliminary evaluation of our procedure on predicate abstraction 
benchmarks from device driver verification in SLAM, and show that our method outperforms 
existing methods for doing predicate abstraction. 

The rest of the paper is organized as follows: Section 1.1 describes related work in 
predicate abstraction techniques. Section 2 describes the background concepts including 
predicate abstraction. Section 3 describes symbolic decision procedures, and instantiates them 
for two different theories (EUF and DIF). Section 4 describes a framework for modularly 
combining the SDPs for two theories that satisfy certain requirements, using an extension 
of the Nelson-Oppen combination method. Section 5 describes the implementation and the 
experimental evaluation of our technique. Finally, we present the conclusions and future 
work in Section 6. 

1.1. Related Work. Several techniques have been suggested to improve the performance of 
predicate abstraction. The techniques can be broadly classified into three categories. In the 
first category, we classify methods that treat the decision procedures as a "black box", and 
attempt to minimize the number of decision procedure calls during predicate abstraction. 
The second category consists of methods that use a quantifier elimination procedure to 
perform predicate abstraction. Finally, there are techniques that do not compute the most 
precise abstraction directly; instead, they rely on counterexamples or proofs in the overall 
verification process to refine the abstraction. In the following paragraphs, we describe these 
techniques in more detail. 

The techniques that aim to reduce the number of calls to the theorem prover or decision 
procedure are mostly based on enumerating cubes over $P$ in an increasing order of their 
size. Das et al. [DDP99] enumerate cubes over a tree, after fixing the order of predicates 
that appear in any path to the leaves. If a cube is found unsatisfiable, then all its 
sub-cubes (represented by the subtree) are pruned off. This method may require $2^{|P|+1}$ calls to 
the theorem prover in the worst case. Saidi and Shankar [SS99] relax the order on the 
predicates, and enumerate all possible cubes ($3^{|P|}$ of them) over the predicates. Flanagan 
and Qadeer [FQ02] provide an algorithm that searches over the $2^{|P|}$ clauses (disjunctions 
of the predicates or their negations) of size $|P|$, but attempts to greedily grow 
the clause (by dropping literals) when such a clause is implied by the formula $e$. Their 
technique requires $|P| \cdot 2^{|P|}$ theorem prover calls in the worst case. Other techniques sacrifice 
precision to gain efficiency, by only considering cubes of some fixed length [BMMR01]. All 
these techniques may require an exponential number of theorem prover calls in the worst 
case, and demonstrate worst case behavior in practice. More importantly, since 
these queries are not incremental, the state of the prover has to be reset across each call, 
precluding any learning across calls. 

Footnote 1. The dual of the predicate abstraction problem, which is to compute the strongest 
Boolean formula $\mathcal{G}_P(e)$ that is implied by $e$, can be expressed as $\neg\mathcal{F}_P(\neg e)$. 

Footnote 2. Throughout this paper, we interpret a set of expressions to be a conjunction over 
the expressions in the set. 

Alternately, predicate abstraction can be formulated as a quantifier elimination 
problem. Lahiri et al. [LBC03] and Clarke et al. [CKSY04] perform predicate abstraction by 
reducing the problem of computing $\mathcal{F}_P(e)$ to Boolean quantifier elimination. The former 
method first transforms a first-order quantifier elimination problem into Boolean 
quantifier elimination by encoding first-order formulas into Boolean formulas; the latter assumes 
all variables are propositional. The method in [LBC03] first converts the quantifier-free 
first-order formula to a Boolean formula such that the translation preserves the set of 
satisfying assignments of the Boolean variables in the original formula. Both these techniques 
use incremental Boolean Satisfiability (SAT) techniques [CKSY04, McM02] to perform the 
Boolean quantifier elimination. These techniques have the benefit that the large number of 
calls to the theorem prover is avoided, and learning can be used to prune away the search 
space in the SAT solver. However, the translation from a first-order formula to a Boolean 
formula can result in a loss of structure (since the arithmetic operations are encoded as 
bitwise operations), and make the translation inefficient. Namjoshi and Kurshan [NK00] 
also proposed using quantifier elimination for first-order logic directly to perform predicate 
abstraction; however, many theories (such as the theory of Equality with Uninterpreted 
Functions) do not admit quantifier elimination. 

Most of the above approaches use decision procedures or SAT solvers as "black boxes", 
at best in an incremental fashion, to perform predicate abstraction. We believe that having 
a customized procedure for predicate abstraction can help improve the efficiency of predicate 
abstraction on large problems. 

Finally, there is a set of techniques that avoid computing the most precise abstraction 
upfront, and refine it only based on failed proof attempts in the verification tool. Das 
and Dill [DD01] and subsequently Ball et al. [BCDR04] use counterexamples to refine the 
predicate abstraction incrementally. Jhala and McMillan [JM05] use interpolants to refine 
the predicate abstraction. It is not clear if it is always preferable to compute the abstraction 
incrementally. However, we have observed that the refinement loop often becomes the main 
bottleneck in these techniques (for example in SLAM), and limits the scalability of the 
overall system [BCDR04]. 

2. Setup 

Figure 1 defines the syntax of a quantifier-free fragment of first-order logic. An 
expression in the logic can either be a term or a formula. A term can either be a variable or an 
application of a function symbol to a list of terms. A formula can be the constants true 
or false or an atomic formula or a Boolean combination of other formulas. Atomic formulas 
can be formed by an equality between terms or by an application of a predicate symbol to 
a list of terms. 

term ::= variable | function-symbol(term, ..., term) 

atomic-formula ::= term = term | predicate-symbol(term, ..., term) 

formula ::= true | false | atomic-formula 

          | formula ∧ formula | formula ∨ formula | ¬formula 

Figure 1: Syntax of a quantifier-free fragment of first-order logic. 

The function and predicate symbols can either be uninterpreted or can be defined by a 
particular theory. For instance, the theory of integer linear arithmetic defines the 
function symbol "+" to be the addition function over integers and "<" to be the comparison predicate 
over integers. If an expression involves function or predicate symbols from multiple theories, 
then it is said to be an expression over mixed theories. 

A formula $F$ is said to be satisfiable if it is possible to assign values to the various 
symbols in the formula from the domains associated with the theories to make the formula 
true. A formula is valid if its negation is not satisfiable (that is, unsatisfiable). We say a formula $A$ 
implies a formula $B$ ($A \Rightarrow B$) if and only if $(\neg A) \vee B$ is valid. 

We define a shared expression to be a Directed Acyclic Graph (DAG) representation 
of an expression where common subexpressions can be shared, by using names to refer to 
common subexpressions. For example, the intermediate variable $t$ refers to the expression 
$e_1$ in the shared expression "let $t = e_1$ in $(e_2 \wedge t) \vee (e_3 \wedge \neg t)$". 
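As an illustration only, the following Python sketch shows one way a shared expression can be represented and evaluated. The tuple encoding of nodes and the function name `evaluate` are assumptions made for this sketch, not part of the paper. Sharing arises because the same node object is referenced from several parents, and memoization ensures each distinct node is evaluated at most once.

```python
# Hypothetical encoding (an assumption of this sketch): a node is one of
#   ("var", name), ("not", x), ("and", x, y), ("or", x, y).

def evaluate(node, env, memo=None):
    """Evaluate a shared Boolean expression; each distinct node is computed once."""
    if memo is None:
        memo = {}
    key = id(node)  # identity of the node object, so shared nodes hit the memo
    if key not in memo:
        op = node[0]
        if op == "var":
            memo[key] = env[node[1]]
        elif op == "not":
            memo[key] = not evaluate(node[1], env, memo)
        elif op == "and":
            memo[key] = evaluate(node[1], env, memo) and evaluate(node[2], env, memo)
        else:  # "or"
            memo[key] = evaluate(node[1], env, memo) or evaluate(node[2], env, memo)
    return memo[key]

# let t = e1 in (e2 AND t) OR (e3 AND NOT t), with e1 = a AND b
e1 = ("and", ("var", "a"), ("var", "b"))
e2, e3 = ("var", "e2"), ("var", "e3")
t = e1  # the name t refers to the single node built for e1
shared = ("or", ("and", e2, t), ("and", e3, ("not", t)))

result = evaluate(shared, {"a": True, "b": True, "e2": True, "e3": False})
```

The node for `e1` appears twice in `shared` but is stored once; a tree representation would duplicate it.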

2.1. Predicate Abstraction. A predicate is an atomic formula or its negation. If $G$ is a 
set of predicates, then we define $\overline{G} = \{\neg g \mid g \in G\}$ to be the set containing the negations of 
the predicates in $G$. We use the term "predicate" in a general sense to refer to any atomic 
formula or its negation; it should not be taken to mean only the set of predicates that 
are used in predicate abstraction. 

Definition 2.1. For a set of predicates $P$, a literal $l_i$ over $P$ is either a predicate $p_i$ or 
$\neg p_i$, where $p_i \in P$. A cube $c$ over $P$ is a conjunction of literals. A clause $cl$ over $P$ is a 
disjunction of literals. Finally, a minterm over $P$ is a cube with $|P|$ literals, in which exactly 
one of $p_i$ or $\neg p_i$ is present for each $p_i \in P$. 

Given a set of predicates $P = \{p_1, \ldots, p_n\}$ and a formula $e$, the main operation in 
predicate abstraction involves constructing the weakest Boolean formula $\mathcal{F}_P(e)$ over $P$ such 
that $\mathcal{F}_P(e) \Rightarrow e$. The expression $\mathcal{F}_P(e)$ can be expressed as the set of all the minterms over 
$P$ that imply $e$: 

$$\mathcal{F}_P(e) = \bigvee \{\, c \mid c \text{ is a minterm over } P \text{ and } c \text{ implies } e \,\}$$

Footnote 3. We always use the term "predicate symbol" (and not "predicate") to refer to symbols like "<". 

$$\frac{X = Y}{Y = X} \qquad \frac{X = Y \quad X \neq Y}{\bot} \qquad \frac{X = Y \quad Y = Z}{X = Z} \qquad \frac{X_1 = Y_1 \;\cdots\; X_n = Y_n}{f(X_1, \ldots, X_n) = f(Y_1, \ldots, Y_n)}$$

Figure 2: Inference rules for theory of equality and uninterpreted functions. 

Proposition 2.2. For a set of predicates $P$ and a formula $e$, the following statements are 
true: 

(1) $\mathcal{F}_P(\neg e) \Rightarrow \neg\mathcal{F}_P(e)$, 

(2) $\mathcal{F}_P(e_1 \wedge e_2) \Leftrightarrow \mathcal{F}_P(e_1) \wedge \mathcal{F}_P(e_2)$, and 

(3) $\mathcal{F}_P(e_1) \vee \mathcal{F}_P(e_2) \Rightarrow \mathcal{F}_P(e_1 \vee e_2)$. 

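Proof. These properties follow very easily from the definition of $\mathcal{F}_P$. 

We know that $\mathcal{F}_P(e) \Rightarrow e$, by the definition of $\mathcal{F}_P(e)$. By the contrapositive rule, $\neg e \Rightarrow 
\neg\mathcal{F}_P(e)$. But $\mathcal{F}_P(\neg e) \Rightarrow \neg e$. Therefore, $\mathcal{F}_P(\neg e) \Rightarrow \neg\mathcal{F}_P(e)$. 

To prove the second equation, we prove that (i) $\mathcal{F}_P(e_1 \wedge e_2) \Rightarrow (\mathcal{F}_P(e_1) \wedge \mathcal{F}_P(e_2))$, and 
(ii) $(\mathcal{F}_P(e_1) \wedge \mathcal{F}_P(e_2)) \Rightarrow \mathcal{F}_P(e_1 \wedge e_2)$. Since $e_1 \wedge e_2 \Rightarrow e_i$ (for $i \in \{1, 2\}$), $\mathcal{F}_P(e_1 \wedge e_2) \Rightarrow 
\mathcal{F}_P(e_i)$. Therefore $\mathcal{F}_P(e_1 \wedge e_2) \Rightarrow (\mathcal{F}_P(e_1) \wedge \mathcal{F}_P(e_2))$. On the other hand, since $\mathcal{F}_P(e_1) \Rightarrow e_1$ and 
$\mathcal{F}_P(e_2) \Rightarrow e_2$, we have $\mathcal{F}_P(e_1) \wedge \mathcal{F}_P(e_2) \Rightarrow e_1 \wedge e_2$. Since $\mathcal{F}_P(e_1 \wedge e_2)$ is the weakest expression that 
implies $e_1 \wedge e_2$, $\mathcal{F}_P(e_1) \wedge \mathcal{F}_P(e_2) \Rightarrow \mathcal{F}_P(e_1 \wedge e_2)$. 

To prove the third equation, note that $\mathcal{F}_P(e_1) \vee \mathcal{F}_P(e_2) \Rightarrow e_1 \vee e_2$ and $\mathcal{F}_P(e_1 \vee e_2)$ is 
the weakest expression that implies $e_1 \vee e_2$. 

□ 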

The operation $\mathcal{F}_P(e)$ does not distribute over disjunctions. Consider the example where 
$P = \{x \neq 5\}$ and $e = x < 5 \vee x > 5$. In this case, $\mathcal{F}_P(e) = x \neq 5$. However $\mathcal{F}_P(x < 5) = 
\mathbf{false}$ and $\mathcal{F}_P(x > 5) = \mathbf{false}$, and thus $(\mathcal{F}_P(x < 5) \vee \mathcal{F}_P(x > 5))$ is not the same as $\mathcal{F}_P(e)$. 
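The example can be replayed with a small brute-force sketch in Python. This is not the method proposed in the paper: it enumerates all $2^{|P|}$ minterms, and it replaces the theorem prover with exhaustive checking over a small finite integer domain, which is an assumption made purely for illustration. The names `DOMAIN` and `abstraction` are hypothetical.

```python
from itertools import product

DOMAIN = range(-10, 11)  # assumption: a small finite domain stands in for the theorem prover

def abstraction(preds, e):
    """Return the minterms over preds (as tuples of truth values) that imply e."""
    result = set()
    for signs in product([True, False], repeat=len(preds)):
        # States satisfying the minterm: predicate i holds exactly when signs[i] is True.
        states = [x for x in DOMAIN
                  if all(p(x) == s for p, s in zip(preds, signs))]
        if all(e(x) for x in states):  # the minterm implies e (vacuously if no states)
            result.add(signs)
    return result

P = [lambda x: x != 5]
lt = lambda x: x < 5
gt = lambda x: x > 5
both = lambda x: x < 5 or x > 5

f_or = abstraction(P, both)                         # the single minterm x != 5
f_split = abstraction(P, lt) | abstraction(P, gt)   # no minterm at all, i.e. false
```

Here `f_or` contains the minterm in which `x != 5` holds, while `f_split` is empty, matching the observation that the abstraction of a disjunction can be strictly weaker than the disjunction of the abstractions.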

The above properties suggest that one can adopt a two-tier approach to compute $\mathcal{F}_P(e)$ 
for any formula $e$: 

(1) Convert $e$ into an equivalent Conjunctive Normal Form (CNF), which comprises 
a conjunction of clauses, i.e., $e \equiv (\bigwedge_i cl_i)$. 

(2) For each clause $cl_i \equiv (e^i_1 \vee e^i_2 \vee \ldots \vee e^i_m)$, compute $\mathcal{F}_P(cl_i)$ and return $\mathcal{F}_P(e) = 
\bigwedge_i \mathcal{F}_P(cl_i)$, which is justified by Proposition 2.2(2). 
To obtain an equivalent CNF form, one cannot introduce auxiliary variables (to keep 
the size of the resulting formula linear in the size of the input formula), as is typically 
done during an equisatisfiable CNF translation, because the auxiliary variables would have 
to be existentially quantified out to obtain an equivalent formula. In our case, the CNF 
representation of the formula can therefore be exponentially large compared to the original formula. 
However, the CNF form can be obtained lazily, using a method proposed 
by McMillan [McM02]. 

For the rest of the paper, we focus on computing $\mathcal{F}_P(\bigvee_{e_i \in E} e_i)$, where each $e_i$ is a 
predicate. Unless specified otherwise, we always use $e$ to denote $(\bigvee_{e_i \in E} e_i)$, a disjunction of 
the predicates in the set $E$. 
3. Symbolic Decision Procedures (SDP) 

We now show how to perform predicate abstraction using symbolic decision procedures. 
We start by describing a saturation-based decision procedure for a theory $T$ and then use 
it to describe the meaning of a symbolic decision procedure for the theory $T$. Finally, we 
show how a symbolic decision procedure can yield a shared expression of $\mathcal{F}_P(e)$ for predicate 
abstraction. 

A set of predicates $G$ (over theory $T$) is unsatisfiable if the formula $(\bigwedge_{g \in G} g)$ is 
unsatisfiable. For a given theory $T$, the decision procedure for $T$ takes a set of predicates $G$ in 
the theory and checks if $G$ is unsatisfiable. A theory is defined by a set of inference rules. 
An inference rule $R$ is of the form: 

$$\frac{A_1 \quad A_2 \quad \ldots \quad A_n}{A} \quad (R)$$

which denotes that the predicate $A$ can be derived from predicates $A_1, \ldots, A_n$ in one step. 
Each theory has at least one inference rule for deriving contradiction ($\bot$). We also use 
$g \text{ :- } g_1, \ldots, g_k$ to denote that the predicate $g$ (or $\bot$, where $g = \bot$) can be derived from the 
predicates $g_1, \ldots, g_k$ using one of the inference rules in a single step. Figure 2 describes the 
inference rules for the theory of Equality and Uninterpreted Functions. 



3.1. Saturation based decision procedures. Consider a simple saturation-based 
procedure $DP_T$ shown in Figure 3 that takes a set of predicates $G$ as input and returns 
SATISFIABLE or UNSATISFIABLE. 

The algorithm maintains two sets: (i) $W$ is the set of predicates derived from $G$ up 
to (and including) the current iteration of the loop in step (2); (ii) $W'$ is the set of all 
predicates derived before the current iteration. These sets are initialized in step (1). 
During each iteration of step (2), if a new predicate $g$ can be derived from a set of 
predicates $\{g_1, \ldots, g_k\} \subseteq W'$, then $g$ is added to $W$. The loop terminates after a bound 
$derivDepth_T(G)$. In step (3), we check if any subset of facts in $W$ can derive 
contradiction. If such a subset exists, the algorithm returns UNSATISFIABLE, otherwise it returns 
SATISFIABLE. 

The parameter $d = derivDepth_T(G)$ is a bound (that is determined solely by the set 
$G$ for the theory $T$) such that if the loop in step (2) is repeated for at least $d$ steps, then 
$DP_T(G)$ returns UNSATISFIABLE if and only if $G$ is unsatisfiable. If such a bound exists for 
any set of predicates $G$ in the theory, then the $DP_T$ procedure implements a decision procedure 
for $T$. 

Definition 3.1. A theory $T$ is called a bounded saturation theory, if the procedure $DP_T$ 
described in Figure 3 implements a decision procedure for $T$. 

In the rest of the paper, we only consider bounded saturation theories. Since there is 
no ambiguity, we will drop the term "bounded" in the rest of the paper and refer to such 
a theory as a saturation theory. To show that a theory $T$ is a saturation theory, it suffices 
to consider a decision procedure algorithm for $T$ (say $A_T$) and show that $DP_T$ implements 
$A_T$. This can be shown by deriving a bound on $derivDepth_T(G)$ for any set $G$ in the theory. 
(1) Initialize $W \leftarrow G$. $W' \leftarrow \{\}$. 

(2) For $i = 1$ to $derivDepth_T(G)$: 

    (a) Let $W' \leftarrow W$. 

    (b) For every fact $g \notin W$, if ($g \text{ :- } g_1, \ldots, g_k$) and $g_m \in W'$ for all $m \in [1, k]$: 

        • $W \leftarrow W \cup \{g\}$. 

(3) If ($\bot \text{ :- } g_1, \ldots, g_k$) and $g_m \in W$ for all $m \in [1, k]$: 

    • return UNSATISFIABLE 

(4) else return SATISFIABLE 

Figure 3: $DP_T(G)$: A simple saturation-based procedure for theory $T$. We use $m \in [i, j]$ 
to denote $i \leq m \leq j$. 
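A minimal Python sketch of this saturation loop, restricted to equality without function symbols, may help make the roles of $W$ and $W'$ concrete. The encoding of predicates as tuples `("=", a, b)` and `("!=", a, b)`, the helper `derive_one_step`, and the explicit `depth` parameter (standing in for $derivDepth_T(G)$) are assumptions of this sketch. Only the symmetry, transitivity, and contradiction rules of Figure 2 are modeled.

```python
from itertools import product

def derive_one_step(facts):
    """All predicates derivable in one step from `facts` (symmetry, transitivity)."""
    eqs = [f for f in facts if f[0] == "="]
    new = {("=", b, a) for (_, a, b) in eqs}                      # symmetry
    new |= {("=", a, d)                                            # transitivity
            for (_, a, b), (_, c, d) in product(eqs, eqs) if b == c}
    return new

def dp(G, depth):
    """Saturation-based procedure in the style of Figure 3, for plain equality."""
    W = set(G)
    for _ in range(depth):
        W_prev = set(W)                   # W' <- W
        W |= derive_one_step(W_prev)      # add every g derivable from W'
    # contradiction rule: X = Y and X != Y are both present
    if any(("=", a, b) in W for (op, a, b) in W if op == "!="):
        return "UNSATISFIABLE"
    return "SATISFIABLE"

chain = {("=", "a", "b"), ("=", "b", "c"), ("!=", "a", "c")}
```

Calling `dp(chain, 3)` reports `UNSATISFIABLE`, whereas `dp(chain, 0)` performs no saturation and reports `SATISFIABLE`, which illustrates why the iteration bound matters.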



3.2. Symbolic Decision Procedure. For a (saturation) theory $T$, a symbolic decision 
procedure for $T$ ($SDP_T$) takes sets of predicates $G$ and $E$ as inputs, and symbolically 
simulates $DP_T$ on $G' \cup \overline{E}$, for every subset $G' \subseteq G$. The output of $SDP_T(G, E)$ is a 
symbolic expression representing those subsets $G' \subseteq G$ such that $G' \cup \overline{E}$ is unsatisfiable. 
Thus with $|G| = n$, a single run of $SDP_T$ symbolically executes $2^n$ runs of $DP_T$. 

We introduce a set of Boolean variables $B_G = \{b_g \mid g \in G\}$, one for each predicate 
in $G$. An assignment $\sigma : B_G \to \{\mathbf{true}, \mathbf{false}\}$ over $B_G$ uniquely represents a subset 
$G' = \{g \mid \sigma(b_g) = \mathbf{true}\}$ of $G$. 

Figure 4 presents the symbolic decision procedure for a theory $T$, which symbolically 
executes the saturation based decision procedure $DP_T$ on all possible subsets of the input 
component $G$. Just like the $DP_T$ algorithm, this procedure also has three main components: 
initialization, saturation and contradiction detection. The algorithm also maintains sets $W$ 
and $W'$, as the $DP_T$ algorithm does. 

Since $SDP_T(G, E)$ has to execute $DP_T(G' \cup \overline{E})$ on all $G' \subseteq G$, the number of steps to 
iterate the saturation loop equals the maximum $derivDepth_T(G' \cup \overline{E})$ for any $G' \subseteq G$. For 
a set of predicates $S$, we define the bound $maxDerivDepth_T(S)$ as follows: 

$$maxDerivDepth_T(S) = \max\{\, derivDepth_T(S') \mid S' \subseteq S \,\}$$

During the execution, the algorithm constructs a set of shared expressions with the 
variables over $B_G$ as the leaves and temporary variables $t[\cdot]$ to name intermediate 
expressions. We use $t[(g, i)]$ to denote the expression for the predicate $g$ after the iteration $i$ of 
the loop in step (2) of the algorithm. We use $t[(g, \top)]$ to denote the top-most expression 
for $g$ in the shared expression. Below, we briefly describe each of the phases of $SDP_T$. 

Initialization [Step (1)]. The set $W$ is initialized to $G \cup \overline{E}$ and $W'$ to $\{\}$. The leaves of 
the shared expression symbolically encode each subset $G' \cup \overline{E}$, for every $G' \subseteq G$. For each 
$g \in G$, the leaf $t[(g, 0)]$ is set to $b_g$. For any $e_i \in E$, since $\neg e_i$ is present in every 
subset $G' \cup \overline{E}$, we replace the leaf for $\neg e_i$ with $\mathbf{true}$. 

Saturation [Step (2)]. For each predicate $g$, $S(g)$ is the set of derivations of $g$ from 
predicates in $W'$ during any iteration. For any predicate $g$, we first add all the ways 
to derive $g$ until the previous steps by adding $t[(g, i-1)]$ to $S(g)$. Every time $g$ can be 
derived from some set of facts $g_1, \ldots, g_k$ such that each $g_j$ is in $W'$, we add this derivation 
to $S(g)$ in Equation (3.1). At the end of the iteration $i$, $t[(g, i)]$ and $t[(g, \top)]$ are updated 
with the set of derivations in $S(g)$. The loop is executed $maxDerivDepth_T(G \cup \overline{E})$ times. 
(1) Initialization 

    (a) $W \leftarrow G \cup \overline{E}$ and $W' \leftarrow \{\}$. 

    (b) For each $g \in G$, $t[(g, 0)] \leftarrow b_g$. 

    (c) For each $e_i \in E$, $t[(\neg e_i, 0)] \leftarrow \mathbf{true}$. 

(2) For $i = 1$ to $maxDerivDepth_T(G \cup \overline{E})$ do: // Saturation 

    (a) $W' \leftarrow W$. 

    (b) Initialize $S(g) = \{\}$, for any predicate $g$. 

    (c) For every $g \in W$, $S(g) \leftarrow S(g) \cup \{t[(g, i-1)]\}$. 

    (d) For every $g$, if ($g \text{ :- } g_1, \ldots, g_k$) and $g_m \in W'$ for all $m \in [1, k]$: 

        (i) Update the set of derivations of $g$ at this level: 

            $$S(g) \leftarrow S(g) \cup \Big\{ \bigwedge_{m \in [1,k]} t[(g_m, i-1)] \Big\} \qquad (3.1)$$

        (ii) $W \leftarrow W \cup \{g\}$. 

    (e) For each $g \in W$: $t[(g, i)] \leftarrow \bigvee_{d \in S(g)} d$ 

    (f) For each $g \in W$, $t[(g, \top)] \leftarrow t[(g, i)]$ 

(3) Check for contradiction: 

    (a) Initialize $S(e) = \{\}$. 

    (b) For every $\{g_1, \ldots, g_k\} \subseteq W$, if ($\bot \text{ :- } g_1, \ldots, g_k$) then 

            $$S(e) \leftarrow S(e) \cup \Big\{ \bigwedge_{m \in [1,k]} t[(g_m, \top)] \Big\} \qquad (3.2)$$

    (c) Create the derivations for the goal $e$ as $t[e] \leftarrow (\bigvee_{d \in S(e)} d)$ 

(4) Return the shared expression for $t[e]$. 

Figure 4: Symbolic decision procedure $SDP_T(G, E)$ for theory $T$. The expression $e$ stands 
for $\bigvee_{e_i \in E} e_i$. 



Contradiction [Steps (3,4)]. We know that if $G' \cup \overline{E}$ is unsatisfiable, then $G'$ implies $e$ 
(recall, $e$ stands for $\bigvee_{e_i \in E} e_i$). Therefore, each derivation of $\bot$ from predicates in $W$ gives 
a new derivation of $e$. The set $S(e)$ collects these derivations and constructs the final 
expression $t[e]$, which is returned in step (4). 

The output of the procedure is the shared expression $t[e]$, where the leaves of the 
expression are the variables in $B_G$. The only operations in $t[e]$ are conjunction and disjunction; 
$t[e]$ is thus a Boolean expression (or a Boolean circuit) over $B_G$. The internal nodes in the 
expression are shared and can be inputs to multiple nodes in the subsequent level. We now 
define the evaluation of a (shared) Boolean expression inductively with respect to a subset 
$G' \subseteq G$. 

Definition 3.2. For any Boolean expression $t[x]$ whose leaves are in the set $B_G$, and a set 
$G' \subseteq G$, we define $eval(t[x], G')$ as the recursive evaluation of $t[x]$, after replacing each leaf 
$b_g$ of $t[x]$ with $\mathbf{true}$ if $g \in G'$ and with $\mathbf{false}$ otherwise. The propositional connectives in 
the expression ($\wedge$ and $\vee$) are interpreted using their standard meaning. 

The following theorem explains the correctness of the symbolic decision procedure. 
Theorem 3.3. If $t[e] = SDP_T(G, E)$, then for any set of predicates $G' \subseteq G$, $eval(t[e], G') = 
\mathbf{true}$ if and only if $DP_T(G' \cup \overline{E})$ returns UNSATISFIABLE. 

To prove Theorem 3.3, we first describe an intermediate lemma about $SDP_T$. To 
disambiguate between the data structures used in $DP_T$ and $SDP_T$, we use $W_s$ and $W'_s$ 
(corresponding to symbolic) to denote $W$ and $W'$ respectively for the $SDP_T$ algorithm. 
Moreover, it is also clear that $W'$ (respectively $W'_s$) at the iteration $i$ ($i \geq 1$) is the same 
as $W$ (respectively $W_s$) after $i - 1$ iterations. 

Lemma 3.4. For any set of predicates $G' \subseteq G$, at the end of $i$ ($i \geq 0$) iterations of the loop 
in step (2) of the $SDP_T(G, E)$ and $DP_T(G' \cup \overline{E})$ procedures: 

(1) $W \subseteq W_s$, and 

(2) $eval(t[(g, i)], G') = \mathbf{true}$ if and only if $g \in W$ for the $DP_T$ algorithm. 

Proof. We use an induction on $i$ to prove this lemma, starting from $i = 0$. 

For the base case (after step (1) of both algorithms), $W = G' \cup \overline{E} \subseteq G \cup \overline{E} = W_s$. 
Moreover, for this step, $eval(t[(g, 0)], G')$ for a predicate $g$ can be $\mathbf{true}$ in two ways. 

(1) If $g \in \overline{E}$, then step (1) of $SDP_T$ assigns it to $\mathbf{true}$. Therefore $eval(t[(g, 0)], G')$ is $\mathbf{true}$ 
for any $G'$. But in step (1) of $DP_T(G' \cup \overline{E})$, $W$ contains all the predicates in $G' \cup \overline{E}$, 
and therefore $g \in W$. 

(2) If $g \in G'$, then $eval(t[(g, 0)], G') = eval(b_g, G')$, which is $\mathbf{true}$ by the definition of 
$eval(\cdot, \cdot)$. Again $g \in W$ after step (1) of the $DP_T$ algorithm too. 

Let us assume that the inductive hypothesis holds for all values of $i$ less than $m$. 
Consider the iteration number $m$. It is easy to see that if any fact $g$ is added to $W$ in this 
step, then $g$ is also added to $W_s$; therefore part (1) of the lemma is easily established. 

To prove part (2) of the lemma, we consider two cases depending on whether a 
predicate $g$ was present in $W$ before the $m$-th iteration: 

(1) Assume that after $m - 1$ iterations of the $DP_T(G' \cup \overline{E})$ procedure, $g \in W$. Since $g$ is 
never removed from $W$ during any step of $DP_T$, $g \in W$ after $m$ iterations too. Now, by 
the inductive hypothesis, $eval(t[(g, m-1)], G') = \mathbf{true}$. However, $t[(g, m-1)] \Rightarrow 
t[(g, m)]$ (because $t[(g, m)]$ contains $t[(g, m-1)]$ as one of its disjuncts in step 2(c) of 
the $SDP_T$ algorithm). Therefore, $eval(t[(g, m)], G') = \mathbf{true}$. 

(2) Otherwise, $g \notin W$ after $m - 1$ iterations, and we consider two cases depending on 
whether $g$ can be derived in $DP_T(G' \cup \overline{E})$ in step $m$. 

    (a) If $g$ cannot be derived in this step in the $DP_T$ algorithm, then there is no set 
    $\{g_1, \ldots, g_k\} \subseteq W'$ (of $DP_T$) such that $g \text{ :- } g_1, \ldots, g_k$. Since $W'$ is the same as $W$ 
    after $m - 1$ iterations, the induction hypothesis shows that every derivation 
    $g \text{ :- } g_1, \ldots, g_k$ considered by $SDP_T$ contains a predicate $g_j \in \{g_1, \ldots, g_k\}$ with 
    $eval(t[(g_j, m-1)], G') = \mathbf{false}$. Again, by the induction hypothesis, 
    $eval(t[(g, m-1)], G') = \mathbf{false}$, since $g \notin W$ after $m - 1$ steps. Thus 
    $eval(t[(g, m)], G') = \mathbf{false}$. 

    (b) If $g$ can be derived from $\{g_1, \ldots, g_k\} \subseteq W'$ (of $DP_T$), then $\bigwedge_j t[(g_j, m-1)]$ implies 
    $t[(g, m)]$. But for each $g_j \in \{g_1, \ldots, g_k\}$, $eval(t[(g_j, m-1)], G') = \mathbf{true}$ and thus 
    $eval(t[(g, m)], G') = \mathbf{true}$. 

This completes the induction proof. □ 
We are now ready to complete the proof of Theorem 3.3. 

Proof. Consider the situation where both $SDP_T(G, E)$ and $DP_T(G' \cup \overline{E})$ have executed the 
loop in step (2) for $i = maxDerivDepth_T(G \cup \overline{E})$. We will consider two cases depending on 
whether $\bot$ can be derived in $DP_T(G' \cup \overline{E})$ in step (3). 

• Suppose after $i$ iterations, there is a set $\{g_1, \ldots, g_k\} \subseteq W$ such that $\bot \text{ :- } g_1, \ldots, g_k$. This 
implies that $G' \cup \overline{E}$ is unsatisfiable. By Lemma 3.4, we know that $eval(t[(g_j, \top)], G') = 
\mathbf{true}$ for each $g_j \in \{g_1, \ldots, g_k\}$, and therefore $eval(t[e], G') = \mathbf{true}$. 

• On the other hand, let $eval(t[e], G') = \mathbf{true}$. This implies that there exists a set 
$\{g_1, \ldots, g_k\} \subseteq W_s$ such that $\bot \text{ :- } g_1, \ldots, g_k$ and $eval(t[(g_j, \top)], G') = \mathbf{true}$ for each 
$g_j \in \{g_1, \ldots, g_k\}$. By Lemma 3.4, we know that $\{g_1, \ldots, g_k\} \subseteq W$ for the $DP_T$ 
procedure too. This means that $DP_T(G' \cup \overline{E})$ will return UNSATISFIABLE. 

This completes the proof. □ 

Corollary 3.5. For a set of predicates $P$, if $t[e] = SDP_T(P \cup \overline{P}, E)$, then for any $P' \subseteq 
(P \cup \overline{P})$ representing a minterm over $P$ (i.e. $p_i \in P'$ iff $\neg p_i \notin P'$), $eval(t[e], P') = 
eval(\mathcal{F}_P(e), P')$. 

Hence $t[e]$ is a shared expression for $\mathcal{F}_P(e)$, where $e$ denotes $\bigvee_{e_i \in E} e_i$. An explicit 
representation of $\mathcal{F}_P(e)$ can be obtained by first computing $t[e] = SDP_T(P \cup \overline{P}, E)$ and 
then enumerating the cubes over $P$ that make $t[e]$ true. 

In the following sections, we will instantiate $T$ to be the EUF and DIF theories and 
show that $SDP_T$ exists for such theories. For each theory, we only need to determine the 
value of $maxDerivDepth_T(G)$ for any set of predicates $G$. 

Example 3.6. Figure 5 demonstrates the working of $SDP_T(G, E)$ for a simple example. 
The predicates in $G = \{a = b, b = c, a = d, d = c\}$ and $E = \{a = c\}$ are limited to equality 
and disequality predicates. For this theory $T$, $maxDerivDepth_T(G \cup \overline{E})$ equals $\lg(m)$, 
where $m$ is the number of terms in $G \cup \overline{E}$. We do not show this result for the equality theory 
in this paper, but prove it for the more general theory of difference logic in Section 3.4. 
Therefore, we need to iterate step (2) of the algorithm for $\lg(|\{a, b, c, d\}|) = 2$ steps in 
Figure 5. 

First, a Boolean variable $b_g$ is introduced for each predicate $g \in G$. These 
variables represent $t[(g, 0)]$ for each $g \in G$. For each $e_i \in E$, we use $\mathbf{true}$ to represent 
$t[(\neg e_i, 0)]$. Then step (2) of the algorithm is repeated for 2 steps. At each step, new 
derivations are produced from the existing set of predicates at the level. The nodes at each 
level denote the set $W$ for the particular iteration. Each derivation from two predicates in 
$W$ is represented as the conjunction of the two predicates (using the diamond connective), 
and multiple derivations for a predicate (e.g. 3 ways to derive $a = c$ for $i = 2$) are represented 
with multiple incoming edges to a node. 

Finally, the contradiction inference rule is used to derive contradictions ($\bot$) at the last 
level. Since the only way to derive contradiction in this example is using $a = c$ and $a \neq c$, 
this is the only derivation of $\bot$. The expression $t[e]$ represents the acyclic graph rooted 
at $\bot$, whose leaves are symbols in $B_G$. The expression $t[e]$ intuitively represents all the 
derivations of $a = c$ from $G$. More precisely, it represents all the subsets of $G$ that are 
inconsistent with $a \neq c$. 
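Example 3.6 can be replayed with a short Python sketch of the symbolic procedure for plain equality. Several simplifications are assumptions of this sketch rather than features of the paper's algorithm: equalities are stored with sorted arguments so that symmetry is implicit, only transitivity and the contradiction rule are modeled, inputs contain no reflexive equalities, and the iteration bound is passed in explicitly. All function names are hypothetical. The sketch builds $t[e]$ as a DAG of tuples, evaluates it on every subset $G' \subseteq G$, and compares the result with a direct saturation run.

```python
from itertools import combinations

def eq(x, y): return ("=",) + tuple(sorted((x, y)))
def ne(x, y): return ("!=",) + tuple(sorted((x, y)))

def transitivity(W):
    """Yield (conclusion, premise1, premise2) for each transitivity instance over W."""
    eqs = [g for g in W if g[0] == "="]
    for g1, g2 in combinations(eqs, 2):
        if len(set(g1[1:]) & set(g2[1:])) == 1:
            x, y = sorted(set(g1[1:]) ^ set(g2[1:]))
            yield eq(x, y), g1, g2

def sdp(G, E, depth):
    """Build t[e] as nested tuples; leaves are ("leaf", g) or ("true",)."""
    W = set(G) | {ne(*e[1:]) for e in E}
    t = {g: ("leaf", g) for g in G}
    t.update({ne(*e[1:]): ("true",) for e in E})
    for _ in range(depth):
        W_prev = set(W)
        S = {g: [t[g]] for g in W_prev}              # keep t[(g, i-1)] as a disjunct
        for g, g1, g2 in transitivity(W_prev):       # new derivations at this level
            S.setdefault(g, []).append(("and", t[g1], t[g2]))
            W.add(g)
        t = {g: ("or", tuple(S[g])) for g in W}      # t[(g, i)]
    S_e = [("and", t[g], t[ne(*g[1:])])
           for g in W if g[0] == "=" and ne(*g[1:]) in W]
    return ("or", tuple(S_e))

def evaluate(node, subset, memo=None):
    """eval(t, G'): leaves b_g are true exactly when g is in the subset."""
    memo = {} if memo is None else memo
    if id(node) not in memo:
        tag = node[0]
        if tag == "leaf":
            val = node[1] in subset
        elif tag == "true":
            val = True
        elif tag == "and":
            val = evaluate(node[1], subset, memo) and evaluate(node[2], subset, memo)
        else:  # "or"
            val = any(evaluate(c, subset, memo) for c in node[1])
        memo[id(node)] = val
    return memo[id(node)]

def dp_unsat(G_sub, E, depth):
    """Direct saturation on one subset; True means UNSATISFIABLE."""
    W = set(G_sub) | {ne(*e[1:]) for e in E}
    for _ in range(depth):
        W |= {g for g, _, _ in transitivity(set(W))}
    return any(g[0] == "=" and ne(*g[1:]) in W for g in W)

G = [eq("a", "b"), eq("b", "c"), eq("a", "d"), eq("d", "c")]
E = [eq("a", "c")]
t_e = sdp(G, E, 2)
subsets = [set(c) for r in range(len(G) + 1) for c in combinations(G, r)]
agree = all(evaluate(t_e, s) == dp_unsat(s, E, 2) for s in subsets)
```

In this instance the two procedures agree on all 16 subsets, and the subsets on which $t[e]$ evaluates to true are exactly those containing both $a = b$ and $b = c$, or both $a = d$ and $d = c$.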

There are a couple of observations that one can make from the previous example: 
[Figure 5: diagram of the shared expression, drawn level by level from the leaves at $i = 0$ to the root $t[e]$; not recoverable from the source.] 

Figure 5: Example of SDP, where $G = \{a = b, b = c, a = d, d = c\}$ and $E = \{a = c\}$. 
The diamond connective represents conjunction, and multiple incoming edges 
to a node represent a disjunction. The node corresponding to the predicate $g$ 
at level $i$ represents $t[(g, i)]$. The figure omits several nodes and edges at each 
level to make the diagram readable. 



(1) The expression $t[e]$ is a Boolean formula with $B_G$ as inputs and an alternation of AND 
and OR operations. There are no negations (NOT) in the formula. 

(2) Even for this simple example, there are several redundant derivations. For example, 
consider the node $a = b$ in level $i = 2$. At this level, $a = b$ can either be derived from 
$a = b$ or from $b = c$ and $a = c$, in the previous level. However, the derivation of $a = c$ 
in level $i = 1$ already uses $a = b$ (at level $i = 0$) for one of its derivations. This means 
that the set of derivations of $a = b$ in level $i = 2$ contains redundant derivations. These 
derivations do not affect the correctness of the procedure, but simply increase the size 
of $t[e]$. However, as we will see in the next two sections, the size of the graph for $t[e]$ is 
still (pseudo) polynomially bounded for interesting theories. 

Remark 3.7. It may be tempting to terminate the loop in step (2) of $SDP_T(G, E)$ once 
the set of predicates in $W$ does not change across two iterations. However, this would lead 
to an incomplete procedure, as the following example demonstrates. 

Example 3.8. Consider an example where G contains a set of predicates that denotes an 
"almost" fully connected graph over vertices xi, . . . G contains an equality predicate 
between every pair of variables except the edge between xi and Xn- Let E = {xi = Xn}- 

After one iteration of the SDPt algorithm on this example, W will contain an equality 
between every pair of variables including xi and x„ since xi = ,t„ can be derived from 
xi = Xi,Xi = Xn, for every 1 < i < n. Therefore, if the SDPt algorithm terminates once 
the set of predicates in W stabilizes, the procedure will terminate after two steps. 

(1) Partition the set of terms in terms(G) into equivalence classes using the G_= 
    predicates. At any point in the algorithm, let EC(t) denote the equivalence class for 
    any term t ∈ terms(G). 

    (a) Initially, each term belongs to its own distinct equivalence class. 

    (b) We define a procedure merge(t_1, t_2) that takes two terms as inputs. The 
        procedure first merges the equivalence classes of t_1 and t_2. If there are two 
        terms s_1 = f(u_1, ..., u_n) and s_2 = f(v_1, ..., v_n) such that EC(u_i) = EC(v_i), 
        for every 1 ≤ i ≤ n, then it recursively calls merge(s_1, s_2). 

    (c) For each t_1 = t_2 ∈ G_=, call merge(t_1, t_2). 

(2) If there exists a predicate t_1 ≠ t_2 in G_≠ such that EC(t_1) = EC(t_2), then return 
    unsatisfiable; else satisfiable. 

Figure 6: Simple description of the congruence closure algorithm. 

Now, consider the subset G' = {x_1 = x_2, x_2 = x_3, ..., x_i = x_{i+1}, ..., x_{n-1} = x_n} of 
G. For this subset of G, DP_T(G' ∪ E) requires lg(n) > 1 (for n > 2) steps to derive the 
fact x_1 = x_n. Therefore SDP_T(G, E) does not simulate the action of DP_T(G' ∪ E). More 
formally, we can show that eval(t[e], G') = false, but G' ∪ E is unsatisfiable. 
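
The gap described in Example 3.8 can be reproduced with a minimal sketch that applies
only the transitivity rule in rounds. The helper names saturate_step and rounds_to_derive,
and the encoding of an equality as a two-element frozenset, are assumptions of this
illustration rather than part of the SDP_T algorithm.

```python
from itertools import combinations

def saturate_step(eqs):
    """One round of transitivity over undirected equalities.

    Each equality is a frozenset of two variables; only facts present at the
    start of the round are combined, mirroring one iteration of the loop.
    """
    new = set(eqs)
    for p, q in combinations(eqs, 2):
        if len(p & q) == 1:
            new.add(p ^ q)  # x = y and y = z give x = z
    return new

def rounds_to_derive(eqs, goal):
    """Number of rounds until `goal` appears, or None if the set stabilizes first."""
    current, rounds = set(eqs), 0
    while goal not in current:
        nxt = saturate_step(current)
        if nxt == current:
            return None
        current, rounds = nxt, rounds + 1
    return rounds

n = 9
goal = frozenset({1, n})
# G: every pair of variables except (x_1, x_n).
almost_complete = {frozenset(p) for p in combinations(range(1, n + 1), 2)} - {goal}
# G': the chain x_1 = x_2, ..., x_{n-1} = x_n.
chain = {frozenset({i, i + 1}) for i in range(1, n)}
```

With n = 9, rounds_to_derive(almost_complete, goal) is 1 while rounds_to_derive(chain, goal)
is 3, so stopping when W stabilizes on the full set G would miss the longer derivation
needed for the subset G'.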

3.3. SDP for Equality and Uninterpreted Functions. The terms in this logic can 
either be variables or applications of an uninterpreted function symbol to a list of terms. 
A predicate in this theory is t_1 ~ t_2, where each t_i is a term and ~ ∈ {=, ≠}. For a set G 
of EUF predicates, G_= and G_≠ denote the sets of equality and disequality predicates in G, 
respectively. Figure 5 describes the inference rules for this theory. 

Let terms(φ) denote the set of syntactically distinct terms in an expression (a term or 
a formula) φ. For example, terms(f(h(x))) is {x, h(x), f(h(x))}. For a set of predicates G, 
terms(G) denotes the union of the sets of terms in any g ∈ G. 

A decision procedure for EUF can be obtained by the congruence closure algorithm [NO80], 
described in Figure 6. 
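
The steps of Figure 6 can be sketched as follows. This is a deliberately naive version,
assuming terms are encoded as strings (variables) or tuples (symbol, arg, ...); it uses a
union-find structure and runs the congruence rule to a fixpoint instead of calling merge
recursively. The function names are hypothetical and the code is not tuned for efficiency.

```python
def subterms(t, acc=None):
    """Collect all syntactically distinct subterms of t into a set."""
    acc = set() if acc is None else acc
    acc.add(t)
    if isinstance(t, tuple):
        for arg in t[1:]:
            subterms(arg, acc)
    return acc

def congruence_closure(equalities, disequalities):
    """Return True iff the conjunction of the given EUF literals is satisfiable.

    Both arguments are lists of (term, term) pairs.
    """
    terms = set()
    for s, t in list(equalities) + list(disequalities):
        subterms(s, terms)
        subterms(t, terms)
    parent = {t: t for t in terms}  # step (1a): each term is its own class

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]  # path halving
            t = parent[t]
        return t

    def merge(s, t):
        parent[find(s)] = find(t)

    for s, t in equalities:  # step (1c)
        merge(s, t)

    apps = [t for t in terms if isinstance(t, tuple)]
    changed = True
    while changed:  # step (1b), iterated until no congruent pair remains
        changed = False
        for i, s in enumerate(apps):
            for t in apps[i + 1:]:
                if (find(s) != find(t) and s[0] == t[0] and len(s) == len(t)
                        and all(find(a) == find(b) for a, b in zip(s[1:], t[1:]))):
                    merge(s, t)
                    changed = True

    # step (2): unsatisfiable iff some disequality relates two terms of one class
    return all(find(s) != find(t) for s, t in disequalities)
```

For instance, congruence_closure([("x", "y")], [(("f", "x"), ("f", "y"))]) reports
unsatisfiable, since x = y forces f(x) = f(y).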

For a set of predicates G, let m = |terms(G)|. We can show that if we iterate the loop 
in step (2) of DP_T(G) (shown in Figure 3) for at least 3m steps, then DP_T can implement 
the congruence closure algorithm. More precisely, for two terms t_1 and t_2 in terms(G), the 
predicate t_1 = t_2 will be derived within 3m iterations of the loop in step (2) of DP_T(G) if 
and only if EC(t_1) = EC(t_2) after step (1) of the congruence closure algorithm (see proof 
below). 

Proposition 3.9. For a set of EUF predicates G, if m = |terms(G)|, then the value of 
maxDerivDepth_T(G) for the theory is bounded by 3m. 

Proof. We first determine derivDepth_T(G) for any set of predicates in this theory. 

Given a set of EUF predicates G, and two terms t_1 and t_2 in terms(G), we need to 
determine the maximum number of iterations in step (2) of DP_T(G) to derive t_1 = t_2 (if 
G_= implies t_1 = t_2). 

Recall that the congruence closure algorithm (described in Figure 6) is a decision 
procedure for the theory of EUF. At any point in the algorithm, the terms in G are partitioned 
into a set of equivalence classes. The operation EC(t_1) = EC(t_2) is used to determine if t_1 
and t_2 belong to the same equivalence class. 

One way to maintain an equivalence class C = {t_1, ..., t_n} is to keep an equality t_i = t_j 
between every pair of terms in C. At any point in the congruence closure algorithm, the 
set of equivalence classes corresponds to a set of equalities C_= over terms. Then EC(u) = 
EC(v) can be implemented by checking if u = v ∈ C_=. Although this is certainly not an 
efficient representation of equivalence classes, it allows us to build SDP_T 
for this theory. 

Let us implement the C'_= = merge(C_=, t_1, t_2) operation that takes in the current set of 
equivalence classes C_= and two terms t_1 and t_2 that are merged, and returns the set of 
equalities C'_= denoting the new set of equivalence classes. This can be implemented using 
step (2) of the DP_T algorithm as follows: 

(1) C'_= ← C_= ∪ {t_1 = t_2}. 

(2) For every term u ∈ EC(t_1) (i.e., u = t_1 ∈ C_=), add the predicate u = t_2 to C'_= by 
the transitive rule u = t_2 :- u = t_1, t_1 = t_2. Similarly, for every v ∈ EC(t_2), add the 
predicate v = t_1 to C'_= by v = t_1 :- v = t_2, t_2 = t_1. All these steps can be performed 
in one iteration of step (2). 

(3) For every u ∈ EC(t_1) and every v ∈ EC(t_2), add the edge u = v to C'_= by either of the 
two transitive rules (u = v :- u = t_2, t_2 = v) or (u = v :- u = t_1, t_1 = v). 

(4) Return C'_=. 

If there are m distinct terms in G, then there can be at most m merge operations, as 
each merge reduces the number of equivalence classes by one and there were m equivalence 
classes at the start of the congruence closure algorithm. Each merge requires three iterations 
of step (2) of the DP_T algorithm to generate the new equivalence classes. Hence, we 
will need at most 3m iterations of step (2) of DP_T to derive any fact t_1 = t_2 that is implied 
by G_=. 

Observe that this decision procedure DP_T for EUF does not need to derive a predicate 
t_1 = t_2 from G if t_1 and t_2 do not both belong to terms(G). Otherwise, if one generates 
t_1 = t_2, then the infinite sequence of predicates f(t_1) = f(t_2), f(f(t_1)) = f(f(t_2)), ... can 
be generated without ever converging. 

Again, since maxDerivDepth_T(G) is the maximum derivDepth_T(G') for any subset G' ⊆ 
G, and any G' can have at most m terms, maxDerivDepth_T(G) is bounded by 3m. We also 
believe that a more refined counting argument can reduce it to 2m, because two equivalence 
classes can be merged simultaneously in the DP_T algorithm. □ 

3.3.1. Complexity of SDP_T. The run time and the size of the expression generated by SDP_T 
depend both on maxDerivDepth_T(G) for the theory and on the maximum number of predicates 
in W at any point during the algorithm. The maximum number of predicates in W can be 
at most m(m − 1)/2, considering an equality between every pair of terms. The disequalities are 
never used except for generating contradictions. It is also easy to verify that the size of S(g) 
(used in step (2) of SDP_T) is polynomial in the size of the input. Hence the run time of SDP_T 
for EUF and the size of the shared expression returned by the procedure are polynomial in 
the size of the input. 

3.4. SDP for Difference Logic. Difference logic is a simple yet useful fragment of linear 
arithmetic, where predicates are of the form x ⋈ y + c, where x, y are variables, ⋈ ∈ {<, ≤} 
and c is a real constant. Any equality x = y + c is represented as a conjunction of x ≤ y + c 
and y ≤ x − c. The variables x and y are interpreted over real numbers. The function 
symbol "+" and the predicate symbols {<, ≤} are the interpreted symbols of this theory. 
Figure 7 presents the inference rules for this theory. 

Figure 7: Inference rules for Difference logic. 

Given a set G of difference logic predicates, we can construct a graph where the vertices 
of the graph are the variables in G and there is a directed edge in the graph from x to y, 
labeled with (⋈, c), if x ⋈ y + c ∈ G. We will use a predicate and an edge interchangeably 
in this section. 

Definition 3.10. A simple cycle x_1 ⋈ x_2 + c_1, x_2 ⋈ x_3 + c_2, ..., x_n ⋈ x_1 + c_n (where each 
x_i is distinct) is "illegal" if the sum of the edges is d = Σ_{i ∈ [1, n]} c_i and either (i) all 
the edges in the cycle are ≤ edges and d < 0, or (ii) at least one edge is a < edge and d ≤ 0. 

It is well known [CLR90] that a set of difference predicates G is unsatisfiable if and 
only if the graph constructed from the predicates has a simple illegal cycle. Alternately, if 
we add an edge (⋈, c) between x and y for every simple path from x to y of weight c (⋈ 
determined by the labels of the edges in the path), then we only need to check for simple 
cycles of length two in the resultant graph. This corresponds to the rules (C) and (D) in 
Figure 7. 
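
For comparison, the illegal-cycle test of Definition 3.10 can also be phrased as an
all-pairs shortest-path computation. The sketch below uses Floyd-Warshall over
(constant, strictness) pairs; this is a standard textbook formulation substituted here
for illustration and is not the inference-rule saturation used by DP_T. The function
name diff_unsat and the constraint encoding are assumptions of this example.

```python
from itertools import product

def diff_unsat(constraints):
    """Decide unsatisfiability of difference constraints over the reals.

    constraints: iterable of (x, op, y, c) meaning x op y + c, op in {"<", "<="}.
    Returns True iff the constraint graph contains an illegal cycle.
    """
    constraints = list(constraints)
    nodes = sorted({v for x, _, y, _ in constraints for v in (x, y)})
    # A bound (c, s) has s = 0 for "<" and s = 1 for "<="; smaller tuples are tighter.
    best = {(u, v): (float("inf"), 1) for u, v in product(nodes, repeat=2)}
    for v in nodes:
        best[v, v] = (0, 1)  # x <= x + 0 always holds
    for x, op, y, c in constraints:
        best[x, y] = min(best[x, y], (c, 0 if op == "<" else 1))
    # Floyd-Warshall: product() varies its first component (k) slowest.
    for k, i, j in product(nodes, repeat=3):
        via = (best[i, k][0] + best[k, j][0],          # constants add up
               min(best[i, k][1], best[k, j][1]))      # strict if either edge is strict
        best[i, j] = min(best[i, j], via)
    # Illegal cycle: x < x + 0, or x op x + d with d < 0.
    return any(best[v, v] < (0, 1) for v in nodes)
```

A cycle of ≤ edges with total weight 0 is accepted, while the same cycle with one < edge
is rejected, matching cases (i) and (ii) of the definition.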

For a set of predicates G, a predicate corresponding to a simple path in the graph of 
G can be derived within lg(m) iterations of step (2) of the DP_T procedure, where m is the 
number of variables in G (see proof below). 

Proposition 3.11. For a set of DIF predicates G, if m is the number of variables in G, 
then maxDerivDepth_T(G) for the DIF theory is bounded by lg(m). 

Proof. It is not hard to see that if there is a simple path x ⋈_1 x_1 + c_1, x_1 ⋈_2 x_2 + 
c_2, ..., x_{n-1} ⋈_n y + c_n in the original graph of G, then after lg(m) iterations of the loop in 
step (2), there is a predicate x ⋈' y + c in W, where c = Σ_{i ∈ [1, n]} c_i and ⋈' is < if at least 
one of ⋈_i is < and ≤ otherwise. This is because if there is a simple path between x and y 
through edges in G with length (number of edges from G) between 2^{i-1} and 2^i, then the 
algorithm DP_T generates a predicate for the path during iteration i. 

However, DP_T can produce a predicate x ⋈ y + c, even though none of the simple paths 
between x and y add up to this predicate. These facts are generated by the non-simple paths 
that go around cycles one or more times. Consider the set G = {x < y + 1, y < x − 2, x < 
z − 1, ...}. In this case we can produce the fact y < z − 3 from y < x − 2, x < z − 1 and 
then x < z − 2 from y < z − 3, x < y + 1. 

Footnote: Constraints like x ⋈ c are handled by adding a special variable x_0 to denote the 
constant 0, and rewriting the constraint as x ⋈ x_0 + c [SSB02]. 

To prove the correctness of the DP_T algorithm, we will show these additional facts can 
be safely generated. Consider two cases: 

• Suppose there is an illegal cycle in the graph. In that case, after lg(m) steps, we will have 
two facts x ⋈ y + c and y ⋈' x + d in W such that they form an illegal cycle. Thus DP_T 
returns unsatisfiable. 

• Suppose there are no illegal cycles in the original graph for G. For simplicity, let us 
assume that there are only < edges in the graph. A similar argument can be made when 
≤ edges are present. 

In this case, every cycle in the graph has a strictly positive weight. A predicate x ⋈ y + d 
can be generated from non-simple paths only if there is a predicate x ⋈ y + c ∈ G such 
that c < d. The predicate x ⋈ y + d can't be a part of an illegal cycle, because otherwise 
x ⋈ y + c would have to be part of an illegal cycle too. Hence DP_T returns satisfiable. 
Note that we do not need any inference rule to weaken a predicate, X < Y + D :- 
X < Y + C, with C < D. This is because we use the generated predicates only to detect 
illegal cycles. If a predicate x < y + c does not form an illegal cycle, then neither does any 
weaker predicate x < y + d, where d > c. □ 

3.4.1. Complexity of SDP_T. Let c_max be the absolute value of the largest constant in the 
set G. We can ignore any derived predicate of the form x ⋈ y + C from the set W where 
the absolute value of C is greater than (m − 1) * c_max. This is because the maximum weight 
of any simple path between x and y can be at most (m − 1) * c_max. Again, let const(g) be 
the absolute value of the constant in a predicate g. The maximum weight on any simple 
path has to be a combination of these weights. Thus, the absolute value of the constant is 
bounded by: 

C ≤ min{(m − 1) * c_max, Σ_{g ∈ G} const(g)} 

The maximum number of derived predicates in W can be 2 * m^2 * (2 * C + 1), where a 
predicate can be either < or ≤, with m^2 possible variable pairs, and the absolute value of 
the constant is bounded by C. This is a pseudo-polynomial bound, as it depends on the value 
of the constants in the input. 

However, many program verification queries use a subset of difference logic where each 
predicate is of the form x ⋈ y or x ⋈ c. For this case, the maximum number of predicates 
generated can be 2 * m * (m − 1 + k), where k is the number of different constants in the 
input. 
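
The two bounds of Section 3.4.1 are easy to evaluate for a concrete input. The helper
names below are hypothetical, and the sketch assumes integer constants so that the
factor (2 * C + 1) counts the possible constant values between −C and C.

```python
def general_bound(m, constants):
    """Bound 2 * m^2 * (2 * C + 1) on the number of derived predicates in W.

    m: number of variables; constants: the constants of the predicates in G
    (assumed to be integers in this sketch).
    """
    c_max = max(abs(c) for c in constants)
    # C <= min{(m - 1) * c_max, sum of const(g) over g in G}
    C = min((m - 1) * c_max, sum(abs(c) for c in constants))
    return 2 * m * m * (2 * C + 1)

def restricted_bound(m, k):
    """Bound 2 * m * (m - 1 + k) when predicates are x ⋈ y or x ⋈ c.

    k: number of different constants in the input.
    """
    return 2 * m * (m - 1 + k)
```

For m = 3 variables and constants 1, −2, −1, both terms of the minimum equal 4, so
general_bound gives 2 * 9 * 9 = 162; the dependence on the constant values is what makes
the bound pseudo-polynomial.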

4. Combining SDP for saturation theories 

In this section, we provide a method to construct a symbolic decision procedure for the 
combination of saturation theories T_1 and T_2, given SDP for T_1 and T_2. The combination 
is based on an extension of the Nelson-Oppen (N-O) framework [NO79] that constructs a 
decision procedure for the theory T_1 ∪ T_2 using the decision procedures of T_1 and T_2. 
We assume that the theories T_1 and T_2 have disjoint signatures (i.e., they do not share 
any function symbol), and each theory T_i is convex and stably infinite. Let us briefly 
explain the N-O method for combining decision procedures before explaining the method 
for combining SDP. 

4.1. Nelson-Oppen method for Combining Decision Procedures. Given two theories 
T_1 and T_2, and the decision procedures DP_T1 and DP_T2, the N-O framework constructs 
the decision procedure for T_1 ∪ T_2, denoted as DP_{T1 ∪ T2}. 

To decide an input set G, the first step in the procedure is to purify G into sets G_1 
and G_2 such that G_i only contains symbols from theory T_i and G is satisfiable if and only 
if G_1 ∪ G_2 is satisfiable. Consider a predicate g = p(t_1, ..., t_n) in G, where p is a theory 
T_1 symbol. The predicate g is purified to g' by replacing each subterm t_j whose top-level 
symbol does not belong to T_1 with a fresh variable w_j. The expression t_j is then purified to 
t'_j recursively. We add g' to G_1 and the binding predicate w_j = t'_j to the set G_2. We call 
the latter a binding predicate because it binds the fresh variable w_j to a term t'_j. 
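
The purification step can be sketched as below. The encoding (terms as strings or
tuples, a theory_of map from symbols to theory index 1 or 2, fresh variables named
w1, w2, ...) is an assumption of this example; the sketch handles a predicate owned by
either theory, generalizing the T_1 case described above.

```python
from itertools import count

def purify(pred, theory_of, fresh=None):
    """Split one predicate into pure predicates for theories 1 and 2.

    pred and its subterms are str (variables or constants) or tuples
    (symbol, arg, ...); theory_of maps each symbol to 1 or 2.
    Returns {1: [...], 2: [...]} with purified and binding predicates.
    """
    fresh = fresh or (f"w{i}" for i in count(1))
    out = {1: [], 2: []}

    def pure(term, theory):
        if isinstance(term, str):
            return term
        sym, args = term[0], term[1:]
        if theory_of[sym] == theory:
            return (sym,) + tuple(pure(a, theory) for a in args)
        # Alien subterm: replace it by a fresh variable and record the
        # binding predicate w = t' in the theory that owns the symbol.
        w = next(fresh)
        out[theory_of[sym]].append(("=", w, pure(term, theory_of[sym])))
        return w

    top = theory_of[pred[0]]
    out[top].append(pure(pred, top))
    return out

# f(x) <= y + 1, with f uninterpreted (theory 1) and <=, + arithmetic (theory 2).
theory_of = {"f": 1, "<=": 2, "+": 2}
parts = purify(("<=", ("f", "x"), ("+", "y", "1")), theory_of)
```

Here parts[2] holds the purified predicate w1 <= y + 1 and parts[1] holds the binding
predicate w1 = f(x); w1 is the only shared variable.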

Let V_sh be the set of shared variables that appear in G_1 ∩ G_2. A set of equalities Δ 
over variables in V_sh is maintained; Δ records the set of equalities implied by the facts from 
either theory. Initially, Δ = {}. 

Each theory T_i then alternately decides if DP_Ti(G_i ∪ Δ) is unsatisfiable. If any theory 
reports unsatisfiable, the algorithm returns unsatisfiable; otherwise, the theory T_i 
generates the new set of equalities over V_sh that are implied by G_i ∪ Δ. These equalities 
are added to Δ and are communicated to the other theory. This process is continued until 
the set Δ does not change. In this case, the method returns satisfiable. Let us denote 
this algorithm as DP_{T1 ∪ T2}. 

Theorem 4.1 ([NO79]). For convex, stably infinite and signature-disjoint theories T_1 and 
T_2, DP_{T1 ∪ T2} is a decision procedure for T_1 ∪ T_2. 

There can be at most |V_sh| irredundant equalities over V_sh; therefore the N-O loop 
terminates after |V_sh| iterations for any input. 
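
The equality-exchange loop described above can be sketched generically. In the example
both component solvers are the same toy procedure for pure equality and disequality over
variables, chosen only to keep the code short; it stands in for real theory solvers such
as those for EUF and DIF. All names are hypothetical.

```python
from itertools import combinations

def equality_solver(preds, shared):
    """Toy theory solver over variables only.

    preds: list of ("=", a, b) or ("!=", a, b). Returns None if unsatisfiable,
    otherwise the set of implied equalities (frozensets) over `shared`.
    """
    parent = {}

    def find(v):
        parent.setdefault(v, v)
        while parent[v] != v:
            v = parent[v]
        return v

    for op, a, b in preds:
        if op == "=":
            parent[find(a)] = find(b)
    if any(op == "!=" and find(a) == find(b) for op, a, b in preds):
        return None
    return {frozenset((a, b)) for a, b in combinations(sorted(shared), 2)
            if find(a) == find(b)}

def nelson_oppen(solve1, solve2, g1, g2, shared):
    """Exchange implied shared equalities until unsatisfiability or a fixpoint."""
    delta = set()  # equalities over shared variables derived so far
    while True:
        before = len(delta)
        for solve, g in ((solve1, g1), (solve2, g2)):
            implied = solve(list(g) + [("=", *sorted(eq)) for eq in delta], shared)
            if implied is None:
                return "unsatisfiable"
            delta |= implied
        if len(delta) == before:
            return "satisfiable"

g1 = [("=", "x", "u"), ("=", "u", "y")]   # first component implies x = y
g2 = [("!=", "x", "y")]                   # second component forbids x = y
```

On this input the first solver contributes x = y to the shared set, after which the
second solver reports a contradiction, so nelson_oppen returns "unsatisfiable".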

4.2. Combining SDP using Nelson-Oppen method. We will briefly describe a method 
to construct SDP_{T1 ∪ T2} by combining SDP_T1 and SDP_T2. As before, the input to the 
method is the pair (G, E) and the output is an expression t[e]. The facts in E are also 
purified into sets E_1 and E_2 and the new binding predicates are added to either G_1 or G_2. 

Our goal is to symbolically encode the runs of the N-O procedure for G' ∪ E, for every 
G' ⊆ G. For any equality predicate δ over V_sh, we maintain an expression ψ_δ that records 
all the different ways to derive δ (initialized to false). We also maintain an expression ψ_e 
to record all the derivations of e (initialized to false). 

The N-O loop operates just like the case for constructing DP_{T1 ∪ T2}. The SDP_Ti for 
each theory T_i now takes (G_i ∪ Δ, E_i) as input, where Δ is the set of equalities over V_sh 
derived so far. In addition to computing the (shared) expression t[e] as before, SDP_Ti also 
returns the expression t[(δ, T)], for each equality δ over V_sh that can be derived in step (2) 
of the SDP_T algorithm. 

Footnote: We need these restrictions only to exploit the N-O combination result. The 
definitions of convexity and stable infiniteness can be found in [NO79]. 

Footnote: We assume that each theory has an inference rule for deriving equality between 
variables in the theory, and DP_T also returns a set of equalities over variables. 

The leaves of the expressions t[e] and t[(δ, T)] are G_i ∪ Δ (since leaves for E_i are replaced 
with true). We substitute the leaves for any δ ∈ Δ with the expression ψ_δ, to incorporate 
the derivations of δ until this point. We also update ψ_δ ← (ψ_δ ∨ t[(δ, T)]) to add the new 
derivations of δ. Similarly, we update ψ_e ← (ψ_e ∨ t[e]) with the new derivations. 

The N-O loop iterates |V_sh| times to ensure that it has seen every derivation 
of a shared equality over V_sh from any set G'_1 ∪ G'_2 ∪ E_1 ∪ E_2, where G'_i ⊆ G_i. 

After the N-O iteration terminates, ψ_e contains all the derivations of e from G. However, 
at this point, there are two kinds of predicates in the leaves of ψ_e: the purified predicates 
and the binding predicates. If g' was the purified form of a predicate g ∈ G, we replace the 
leaf for g' with b_g. The leaves of the binding predicates are replaced with true, as the fresh 
variables in these predicates are really names for subterms in any predicate, and thus their 
presence does not affect the satisfiability of a formula. Let t[e] denote the final expression 
for ψ_e that is returned by SDP_{T1 ∪ T2}. Observe that the leaves of t[e] are variables in B_G. 

Theorem 4.2. For two convex, stably-infinite and signature-disjoint theories T_1 and T_2, if 
t[e] = SDP_{T1 ∪ T2}(G, E), then for any set of predicates G' ⊆ G, eval(t[e], G') = true if and 
only if DP_{T1 ∪ T2}(G' ∪ E) returns UNSATISFIABLE. 

Since the theories of EUF and DIF satisfy all the restrictions of this section, 
we can construct an SDP for the combined theory that still runs in pseudo-polynomial time. 



5. Implementation and Results 

We have implemented a prototype of the symbolic decision procedure for the 
combination of EUF and DIF theories. To construct F_P(e), we first build a BDD (using the 
CUDD [CUD] BDD package) for the expression t[e] (returned by SDP_T(P ∪ ¬P, E)) and 
then enumerate the cubes from the BDD. 

Creating the BDD for the shared expression t[e] and enumerating the cubes from the 
BDD can have exponential complexity in the worst case. This is because the expression for 
F_P(e) can involve an exponential number of cubes (e.g. the example in Figure 8). However, 
most problems in practice have only a few cubes in F_P(e). Secondly, as the number of leaves of 
t[e] (alternately, the number of BDD variables) is bounded by |P|, the size of the overall BDD is 
usually small, and is computed efficiently in practice. Finally, by generating only the prime 
implicants of F_P(e) from the BDD, we obtain a compact representation of F_P(e). 

We report preliminary results evaluating our symbolic decision procedure based 
predicate abstraction method on a set of software verification benchmarks. The benchmarks are 
generated from the predicate abstraction step for constructing Boolean Programs from C 
programs of Microsoft Windows device drivers in SLAM [BMMR01]. 

We compare our method with two other methods for performing predicate abstraction: 

: DP-based: This method uses the decision procedure zapato [BCLZ04] to enumerate 
the set of cubes that imply e. Various optimizations (e.g. considering cubes in 
increasing order of size) are used to prevent enumerating an exponential number of 
cubes in practice. 

: UCLID-based: This method performs quantifier elimination using incremental SAT-based 
methods [LBC03]. The procedure works by first converting the problem into 
an existential quantifier elimination problem in first-order logic and then reducing 
it to Boolean quantifier elimination by using an encoding to Boolean logic. Finally, 
it uses SAT-based methods for performing Boolean quantification. 

To compare with the DP-based method, we generated 665 predicate abstraction queries 
from the verification of device-driver programs. Most of these queries had between 5 and 
14 predicates in them and are fairly representative of queries in SLAM. The run time of the 
DP-based method was 27904 seconds on a 3 GHz machine with 1GB memory. The run 
time of the SDP-based method was 273 seconds. This gives a little more than 100X speedup 
on these examples, demonstrating that our approach can scale much better than decision 
procedure based methods. We have not been able to run the UCLID-based method on these 
particular SLAM benchmarks; the UCLID-based tool is no longer actively maintained, and 
we had trouble translating these SLAM benchmarks to the input format of UCLID. From our 
earlier experience of using UCLID on similar benchmarks (Fig. 3 in [LBC03]), we believe that 
most of these benchmarks can be solved within a few seconds, and the total runtime would 
not differ by more than 2-3X (in favor of the current technique). 

Footnote: For any Boolean formula φ over variables in V, the prime implicants of φ are a set 
of cubes C = {c_1, ...} over V such that φ ⇔ ∨_{c ∈ C} c, and two or more cubes from C can't 
be combined to form a larger cube. 

Figure 8: Result on diamond examples with increasing number of diamonds. The 
expression e is (a1 = dn). A "-" denotes a timeout of 1000 seconds. 

To compare with the UCLID-based approach, we generated different instances of a 
problem (see Figure 8 for the example) where P is a set of equality predicates representing 
n diamonds connected in a chain and e is an equality a1 = dn. We generated different 
problem instances by varying n. For an instance with n diamonds, there are 
5n − 1 predicates in P and 2^n cubes in F_P(e) to denote all the paths from a1 to dn. 
Figure 8 shows the result comparing both methods. We should note that the UCLID method 
was run on a slightly slower 2 GHz machine. The results illustrate that our method scales 
much better than the SAT-based enumeration used in UCLID for this example. Intuitively, 
the UCLID-based approach grows exponentially with the number of predicates (2^|P|), whereas 
our approach only grows exponentially with the number of diamonds (2^n) in the result. 
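
The counts quoted for the diamond benchmark can be checked with a short sketch. The
exact shape of a diamond is an assumption here: each diamond i is taken to consist of
the equalities a_i = b_i, b_i = d_i, a_i = c_i, c_i = d_i, with consecutive diamonds
linked by d_i = a_{i+1}, which matches the stated total of 5n − 1 predicates.

```python
def diamond_chain(n):
    """Equality predicates, as variable pairs, for n diamonds linked in a chain."""
    preds = []
    for i in range(1, n + 1):
        a, b, c, d = (f"{v}{i}" for v in "abcd")
        preds += [(a, b), (b, d), (a, c), (c, d)]   # the two sides of diamond i
        if i < n:
            preds.append((d, f"a{i + 1}"))          # link to the next diamond
    return preds

def count_simple_paths(preds, src, dst):
    """Count simple paths from src to dst in the undirected equality graph."""
    adj = {}
    for u, v in preds:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    def walk(node, seen):
        if node == dst:
            return 1
        return sum(walk(nxt, seen | {nxt}) for nxt in adj[node] if nxt not in seen)

    return walk(src, {src})
```

Each simple path from a1 to dn picks one side of every diamond, which is why the number
of cubes doubles with each additional diamond.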

6. Conclusions and future work 

In this paper, we have presented the concept of symbolic decision procedures and shown 
its use for predicate abstraction. We have provided an algorithm for synthesizing an SDP for 
any bounded saturation theory. We show that such SDPs exist for interesting theories such 
as EUF and difference logic. These SDPs construct a shared expression and run with 
polynomial and pseudo-polynomial complexity, respectively. Finally, we have provided a method 
for constructing the SDP for simple mixed theories using an extension of the Nelson-Oppen 
combination framework. Preliminary results comparing it with some of the existing approaches 
are encouraging. 

There are several avenues of future work, some of which are outlined below: 

• First, it is interesting to find out how to construct an SDP for other theories, including 
the theory of linear arithmetic (over rationals). For linear arithmetic, one can perform 
a "symbolic" Fourier-Motzkin [DE73] elimination procedure to construct an SDP: the 
inference rule would eliminate a variable from all the predicates in a given level. However, 
it is not clear how to generate implied equalities from such a procedure to combine the 
SDP with SDPs for other theories. 

• Second, as the example in Figure 5 illustrated, there are a lot of redundant derivations 
present in the resultant expression. The algorithm will benefit from optimizations that 
can minimize such redundant derivations. 

• Extend the combination of SDPs to non-convex theories. 